Abstract

Non-controllable finite-state channels (FSCs) are finite-state channels in which the user cannot control the channel 
state, i.e., the state evolves freely in time. Thus far, good upper bounds on the capacities as well as computable 
capacities of general non-controllable FSCs with/without feedback are unknown. Here we consider the delayed channel 
state as part of the channel input and then mathematically define the directed information from the new channel 
input (including the source and the delayed channel state) sequence to the channel output sequence. With this technique, 
upper bounds on the capacities of non-controllable FSCs with/without feedback are developed. The upper bounds are 
achieved by conditional Markov sources, conditioned on the delayed feedback and the delayed state information. A 
dynamic programming method is proposed to optimize conditional Markov sources and the bounds are numerically 
computed by Monte Carlo techniques. 

Index Terms 

Conditional Markov source, delayed feedback, delayed state information, directed information, dynamic programming,
feedback capacity, feedforward capacity, non-controllable finite-state machine channel, upper bound.

I. Introduction 

The capacity of a memoryless channel without feedback was derived by Shannon in [1]. Furthermore, feedback
does not increase the capacities of memoryless channels [2]. In contrast, there is no universal statement for the
capacities of channels with memory used with/without feedback. Finite-state channels (FSCs) describe channels with 
finite memory such as finite-length intersymbol interference (ISI) channels [3] and Gilbert-Elliott (GE) channels [4]. 

In [3], Hirt used a Monte Carlo method to evaluate bounds on the i.u.d. capacity $C_{\mathrm{i.u.d.}}$ of the ISI channel, which is
defined as the information rate when the channel inputs are independent and uniformly distributed (i.u.d.). In recent
years, [5-8] presented efficient Monte Carlo methods to evaluate $C_{\mathrm{i.u.d.}}$ and information rates of FSCs whose inputs
are Markov random processes. A generalization of the well-known Blahut-Arimoto algorithm [9, 10] was proposed 
to numerically optimize (maximize) information rates for a given Markov process order. These methods, coupled 
with recent proofs [11, 12] that Markov processes asymptotically achieve feedforward capacities of FSCs, can be 
utilized to very closely lower-bound the feedforward capacities. Vontobel and Arnold introduced an upper bound 
on the feedforward capacity [13], which is not always numerically tight. So far, the delayed feedback capacity [14] 
is the numerically tightest upper bound on the feedforward capacity of the controllable FSC, in which the user can 
take the channel from any state into any desired state by selecting finite channel inputs. 

In 1990, Massey introduced the directed information [15], and showed that the directed information from the 
channel input sequence to the channel output sequence equals the mutual information between the input and 
output sequences if the channel is used without feedback. Massey also showed that the supremum of the directed 
information rate is an upper bound on the capacity if feedback is utilized. Tatikonda first proved that the supremum of 
the directed information rate of a class of FSCs with feedback is achievable, and introduced a dynamic programming 
framework to compute the directed information for Markov channels with feedback in [16]. Recently, some efficient 
dynamic programming algorithms to evaluate the feedback capacities were presented [14, 17-22]. In [14], a value 
iteration dynamic programming algorithm was introduced to compute the feedback and delayed feedback capacities 
of controllable FSCs, and in [19] the method was extended to power-constrained Gaussian noise channels with 
memory. However, the method in [14] is applicable only to the class of controllable FSCs. No similar tool has thus 
far been proposed for non-controllable FSCs, in which the user cannot control channel states. In addition, other 
coding theorems [23,24] based on directed information were also proposed for some specific FSCs with feedback. 
Differing from the method based on directed information, Viswanathan presented a single letter formula of the 
feedback capacity for the Markov channel with perfect channel state information at the receiver in [25]. 

In this paper, we consider general non-controllable indecomposable FSCs¹ [27] with/without feedback and
introduce upper bounds on their capacities. We develop an upper-bounding technique in which the delayed channel
state is considered as part of the channel input. We characterize the directed information from the new channel
input (including the source and the delayed channel state) sequence to the channel output sequence. Next, we
majorize the set of considered channel inputs in order to obtain a method to evaluate the upper bound. Our system
is different from that in [28], which assumes that the channel output is fed back with delay and the state information
is available with delay at both transmitter and receiver. We construct two nested sequences of upper bounds for
feedforward and feedback capacities, respectively. Through three theorems, we show that the upper bounds can be
achieved by finite-order conditional Markov sources, conditioned on the delayed feedback (FB) and the delayed
state information (SI). Similar to [14], we formulate the computation of the upper bound as an average reward per
stage stochastic control problem [29,30] and propose a dynamic programming method to numerically solve this
problem. Using a quantized value iteration algorithm we obtain quantized optimal conditional Markov sources, and
numerically evaluate the upper bounds by Monte Carlo methods.

¹Note that for some special non-controllable FSCs such as the Gilbert-Elliott channel [4] and certain Gilbert-Elliott-like channels [26], the
capacity-achieving distributions are known, and the feedforward capacities can be evaluated using the tools in [5-8]. For general non-controllable
FSCs, however, closely bounding the feedforward capacities and the feedback capacities seems to be the only practical approach.

Structure: The rest of this paper is structured as follows. The channel model is given in the next section. We 
introduce the channel capacities of non-controllable FSCs with/without feedback, and develop the upper bounds on 
the capacities in Section III, followed by three theorems to facilitate the computation of these bounds in Section 
IV. In Section V, we introduce dynamic programming methods to optimize the source of channel input sequences 
and then compute the upper bounds by Monte Carlo methods. Some numerical results are presented in Section VI, 
followed by the conclusion in Section VII. 

Notation: A random variable is denoted by an upper-case letter (e.g. $X$) and its realization is denoted by the
corresponding lower-case letter (e.g. $x$). A vector of random variables $[X_i, X_{i+1}, \ldots, X_j]$ is shortly denoted by
$X_i^j$ and its realization is denoted by $x_i^j$. By default, we set $X^j = X_1^j$ and $x^j = x_1^j$. The cardinality of a set $\mathcal{X}$ is
denoted by $|\mathcal{X}|$.



II. Channel Model

Let $S_t$, $X_t$ and $Y_t$ denote the channel state, the channel input and the channel output at time $t \in \mathbb{Z}$, whose
realizations are $s_t$, $x_t$ and $y_t$, respectively. Each state $s_t$ is drawn from a finite alphabet $\mathcal{S}$, each input letter $x_t$
is drawn from a finite alphabet $\mathcal{X}$, and each output letter $y_t$ is drawn from a finite or continuous alphabet $\mathcal{Y}$. For
ease of notation, we shall consider only the case where the set $\mathcal{Y}$ is finite in this paper. It is straightforward to
generalize our results to channels with continuous output by replacing the probability mass function $\Pr(y)$ with
the probability density function $f(y)$.

More specifically, a finite-state channel (FSC) has a state sequence $s = s_0, s_1, s_2, \ldots, s_N$, an input sequence
$x = x_1, x_2, \ldots, x_N$ and an output sequence $y = y_1, y_2, \ldots, y_N$. As in [27], we assume that, for an FSC, the
channel output $y_t$ and channel state $s_t$ are statistically independent of all channel inputs, outputs and states prior
to $x_t$, $y_t$ and $s_{t-1}$, if the channel input $x_t$ and channel state $s_{t-1}$ are known. That is,

$$\Pr(y_t, s_t \mid x^t, s_0^{t-1}, y^{t-1}) = \Pr(y_t, s_t \mid x_t, s_{t-1}). \tag{1}$$

A non-controllable FSC [10] is a finite-state channel in which the user cannot control channel states, i.e., the
channel state evolves freely in the sense that, conditional on $s_{t-1}$, the state $s_t$ is statistically independent of $x^t$ and
$y^t$. Specifically, we assume that the channel state sequence forms a non-controllable stationary irreducible Markov
chain in the sense that

$$\Pr(s_t \mid s_0^{t-1}, x^t, y^t) = \Pr(s_t \mid s_{t-1}). \tag{2}$$

This equation implies that $\Pr(s_t \mid x_t, s_{t-1}, y_t) = \Pr(s_t \mid s_{t-1})$. Therefore, a non-controllable FSC can be described
statistically by the conditional probability $\Pr(y_t, s_t \mid x^t, s_0^{t-1}, y^{t-1})$ satisfying

(3)
Fig. 1. A trellis section of the RLL(1, ∞) sequence.

Fig. 2. A Gilbert-Elliott channel.

From the assumptions on the FSC discussed above, additional characteristics of the channel used with feedback
and without feedback can be distinguished as follows. If there is no feedback, then given the channel state $s_{t-1}$
and channel input $x_t$, the channel output $y_t$ and state $s_t$ are statistically independent of other channel inputs and
prior channel states and outputs, i.e.,

(4)

On the other hand, if feedback is allowed (precisely speaking, the output sequence $y^{t-1}$ is available at the transmitter
before emitting symbol $x_t$), then the channel output $Y_t$ statistically affects the future channel inputs $X_{t+1}, X_{t+2}, \ldots$. Therefore,
instead of (4), we have

(5)

The non-controllable FSC will be illustrated by the following example related to the Gilbert-Elliott channel.

Example 1 (The RLL(1, ∞)-GE Channel): The channel input is required to be a binary run-length-limited (RLL)
sequence satisfying the RLL(1, ∞) constraint, i.e., there are no consecutive ones in the sequence (see Fig. 1). The
channel is a Gilbert-Elliott channel with two states (see Fig. 2), a "good" state and a "bad" state. Denote the channel
state alphabet by $\mathcal{S} = \{g, b\}$. The transition probabilities between channel states are $p(b|g) = \Pr(S_t = b \mid S_{t-1} = g)$
and $p(g|b) = \Pr(S_t = g \mid S_{t-1} = b)$. When the channel state is "good", i.e., $S_t = g$, the channel acts as a binary
symmetric channel (BSC) with cross-over probability $\epsilon_g$. When the channel is "bad", i.e., $S_t = b$, the channel is a
BSC with cross-over probability $\epsilon_b$. □
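The channel of Example 1 can be simulated in a few lines. The sketch below is illustrative only: the function name `simulate_rll_ge`, the assumed initial state $s_0 = g$, and the simple input rule (emit a one with probability `p_one` whenever the previous input was zero) are assumptions made for this illustration and are not the optimized sources developed later in the paper.

```python
import random


def simulate_rll_ge(n, p_bg, p_gb, eps_g, eps_b, p_one=0.4, seed=0):
    """Simulate n uses of a Gilbert-Elliott channel with an RLL(1, inf) input.

    p_bg = Pr(S_t = b | S_{t-1} = g), p_gb = Pr(S_t = g | S_{t-1} = b).
    p_one is a hypothetical input parameter, not taken from the paper.
    """
    rng = random.Random(seed)
    state = "g"      # assumed initial state s_0
    prev_x = 0
    xs, states, ys = [], [], []
    for _ in range(n):
        # RLL(1, inf) constraint: a one is never followed by another one.
        x = 0 if prev_x == 1 else int(rng.random() < p_one)
        # Non-controllable state: s_t depends only on s_{t-1}, never on x or y.
        if state == "g":
            state = "b" if rng.random() < p_bg else "g"
        else:
            state = "g" if rng.random() < p_gb else "b"
        # The current state selects the cross-over probability of the BSC.
        eps = eps_g if state == "g" else eps_b
        y = x ^ int(rng.random() < eps)
        xs.append(x)
        states.append(state)
        ys.append(y)
        prev_x = x
    return xs, states, ys
```

Because the state update never reads `x` or `y`, the state sequence is the same Markov chain regardless of which input source is plugged in, which is the defining property of a non-controllable FSC.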
III. Channel Capacities and Upper Bounds

A. Channel Capacities 

In order to unify the presentations of both channel capacities (the feedforward capacity and the feedback capacity),
we use the notion of directed information, which was introduced by Massey in [15]. For any given joint probability
distribution $\Pr(X^N, Y^N)$, the directed information from the channel input sequence to the channel output
sequence is defined as

$$I(X^N \to Y^N) \triangleq \sum_{t=1}^{N} I(X^t; Y_t \mid Y^{t-1}).$$

It has been shown that $I(X^N \to Y^N) \le I(X^N; Y^N)$ with equality if the channel is used without feedback, and
that the directed information can characterize the feedback capacity of an FSC as shown by the coding theorems
in [18,23,24]. For simplicity, we denote $\mathcal{I}(X \to Y)$ as the directed information rate² from the channel input to
the channel output, that is,

$$\mathcal{I}(X \to Y) \triangleq \lim_{N \to \infty} \frac{1}{N} I(X^N \to Y^N). \tag{6}$$
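For small block lengths, the directed information can be evaluated by brute force directly from its definition, using $I(X^t; Y_t \mid Y^{t-1}) = H(Y^t) - H(Y^{t-1}) - H(X^t, Y^t) + H(X^t, Y^{t-1})$. The following sketch assumes the joint distribution is given as a dictionary keyed by pairs of tuples `(x, y)`; the helper names and the memoryless BSC test case are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict
from itertools import product
from math import log2


def entropy(joint, key):
    """Entropy (bits) of the marginal obtained by mapping (x, y) through key."""
    marginal = defaultdict(float)
    for (x, y), p in joint.items():
        marginal[key(x, y)] += p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


def directed_information(joint, n):
    """I(X^n -> Y^n) = sum_t I(X^t; Y_t | Y^{t-1}) for a joint pmf on (x^n, y^n)."""
    total = 0.0
    for t in range(1, n + 1):
        h_y_t = entropy(joint, lambda x, y: y[:t])
        h_y_prev = entropy(joint, lambda x, y: y[:t - 1])
        h_xy_t = entropy(joint, lambda x, y: (x[:t], y[:t]))
        h_xy_prev = entropy(joint, lambda x, y: (x[:t], y[:t - 1]))
        total += h_y_t - h_y_prev - h_xy_t + h_xy_prev
    return total


def mutual_information(joint):
    """I(X^n; Y^n) = H(X^n) + H(Y^n) - H(X^n, Y^n)."""
    return (entropy(joint, lambda x, y: x) + entropy(joint, lambda x, y: y)
            - entropy(joint, lambda x, y: (x, y)))


def bsc_joint(n, p):
    """Joint pmf of i.i.d. uniform inputs sent over a memoryless BSC, no feedback."""
    joint = {}
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            prob = 0.5 ** n
            for xi, yi in zip(x, y):
                prob *= p if xi != yi else 1 - p
            joint[(x, y)] = prob
    return joint
```

For the BSC test case there is no feedback, so `directed_information` and `mutual_information` agree, as stated above. The enumeration grows exponentially in `n`, so this is only a check of the definition, not a practical way to evaluate the rate in (6).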

In this paper, we use the following definitions of channel capacities for FSCs. 
Definition 1: The feedforward capacity of an indecomposable FSC is given by

$$C \triangleq \sup_{\{\Pr(x_t \mid x^{t-1})\}} \mathcal{I}(X \to Y) \tag{7}$$

where the supremum is taken over all stationary channel input processes. The feedback capacity of an
indecomposable FSC is given by

$$C^{\mathrm{fb}} \triangleq \sup_{\{\Pr(x_t \mid x^{t-1}, y^{t-1})\}} \mathcal{I}(X \to Y) \tag{8}$$

where the supremum is taken over stationary channel input processes that are causally dependent on the past channel
outputs. This means that all past channel outputs must be fed back to the source before emitting the symbol
$X_t$.

For a controllable FSC, such as the intersymbol interference channel, Yang et al. [14] proved that the feedback 
capacity can be achieved by feedback-dependent Markov sources with the same memory length as the channel's. 
They also showed that the (delayed) feedback capacity can be estimated by dynamic programming and can be 
utilized to tightly upper-bound the feedforward capacity. For a non-controllable FSC, since the user cannot control 
channel states, the computation of capacities in (7) and (8) is cumbersome. The main objective of this paper is to 
develop numerically computable upper bounds on the capacities of non-controllable FSCs with/without feedback. 

²We note that the limit in (6) may not exist for all channels and all sources. In that case, the "lim inf" should be used instead of "lim".
However, in this paper, we consider only indecomposable channels and stationary channel inputs, for which the limit in (6) does exist.
B. Upper Bounds on Capacities

Motivated by but differing from the work of Tatikonda [16], in which the perfect channel state is assumed to 
be known at the receiver, we make an assumption that the channel state is available at the transmitter and insert a 
state sequence into the definition of directed information as follows. 

Definition 2: For stationary processes $\{X_t\}$, define $\mathcal{I}_v(X, S \to Y)$ as the following information rate³

$$\mathcal{I}_v(X, S \to Y) \triangleq \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} I(X^t, S_0^{t-v-1}; Y_t \mid Y^{t-1}). \tag{9}$$

□

Evidently, $\mathcal{I}_v(X, S \to Y)$ is a directed information rate from the new channel input (including the source and
the delayed channel state) to the channel output. In other words, in (9) we consider the delayed channel state as a
part of the channel input. Obviously, for a given channel input process, there is a nested sequence of upper bounds
on $\mathcal{I}(X \to Y)$ as

$$\mathcal{I}(X \to Y) \le \cdots \le \mathcal{I}_{v+1}(X, S \to Y) \le \mathcal{I}_v(X, S \to Y) \le \cdots \le \mathcal{I}_0(X, S \to Y). \tag{10}$$

Furthermore, if we take the supremum over the corresponding sets of stationary input processes, the capacities in
Definition 1 can be bounded as

$$C \le \sup_{\{\Pr(x_t \mid x^{t-1})\}} \mathcal{I}_v(X, S \to Y),$$
$$C^{\mathrm{fb}} \le \sup_{\{\Pr(x_t \mid x^{t-1}, y^{t-1})\}} \mathcal{I}_v(X, S \to Y). \tag{11}$$

These upper bounds, however, cannot be easily evaluated because the source sets are too general to be specified
with few parameters. To develop simpler expressions for upper bounds, we need to define the following sources in a
similar way to that in [31].

Definition 3: Assume that the $u$-delayed output feedback (FB) $y^{t-u-1}$ and the $v$-delayed state information
(SI) $s_0^{t-v-1}$ are available at the source just before the emission of $X_t$ (see Fig. 3). Then the channel input
$X_t$ could be selected according to a preset conditional probability law $\Pr(x_t \mid x^{t-1}, s_0^{t-v-1}, y^{t-u-1})$. All such input
processes $\{X_t\}$ are described by a set $\mathcal{P}(u, v)$, i.e.,

$$\mathcal{P}(u, v) \triangleq \{\Pr(x_t \mid x^{t-1}, s_0^{t-v-1}, y^{t-u-1})\}.$$

In words, $\mathcal{P}(u, v)$ represents the set of all sources (channel inputs) with $u$-delayed FB and $v$-delayed SI. □

Fig. 3. A non-controllable FSC model with $u$-delayed FB and $v$-delayed SI.

Note that the delays $u$ and $v$ are both non-negative. An important subclass of sources from $\mathcal{P}(u, v)$, called
conditional Markov source, is defined as follows.

Definition 4: For $v \le m$, a source sequence $\{X_t\}$ used with $u$-delayed FB and $v$-delayed SI is said to be an
$m$-th order conditional Markov source if the conditional probability mass function satisfies

$$\Pr(x_t \mid x^{t-1}, s_0^{t-v-1}, y^{t-u-1}) = \Pr(x_t \mid x_{t-m}^{t-1}, s_{t-m-1}^{t-v-1}, y^{t-u-1}).$$

Let $\mathcal{P}_m(u, v)$ represent the set of all such sources, that is,

$$\mathcal{P}_m(u, v) \triangleq \{\Pr(x_t \mid x_{t-m}^{t-1}, s_{t-m-1}^{t-v-1}, y^{t-u-1})\}.$$

□

³In Section V, we will see that maximizing $\mathcal{I}_v(X, S \to Y)$ is considered to be a stochastic control problem, which has a stationary policy
as a solution and implies that the limit in (9) does exist for those distributions that maximize $\mathcal{I}_v(X, S \to Y)$, see [14,16,29,30].

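From the definitions of sources $\mathcal{P}(u, v)$ and $\mathcal{P}_m(u, v)$, we have the following facts for non-negative $u$, $v$ and $m$.
• The sets of stationary channel input processes $\{\Pr(x_t \mid x^{t-1})\}$ and $\{\Pr(x_t \mid x^{t-1}, y^{t-1})\}$ are subsets of the
conditional source sets $\mathcal{P}(u, v)$ and $\mathcal{P}(0, v)$, respectively.
• $\mathcal{P}(u+1, v+1) \subset \mathcal{P}(u+1, v) \subset \mathcal{P}(u, v)$ and $\mathcal{P}(u+1, v+1) \subset \mathcal{P}(u, v+1) \subset \mathcal{P}(u, v)$.
• If $v + 1 \le m$, then $\mathcal{P}_m(u+1, v+1) \subset \mathcal{P}_m(u+1, v) \subset \mathcal{P}_m(u, v)$ and $\mathcal{P}_m(u+1, v+1) \subset \mathcal{P}_m(u, v+1) \subset \mathcal{P}_m(u, v)$.
• If $v \le m$, then $\mathcal{P}_m(u, v) \subset \mathcal{P}_{m+1}(u, v) \subset \cdots \subset \mathcal{P}(u, v)$.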

Moreover, we can prove the following proposition. 

Proposition 1: For a non-controllable FSC with sources in the set $\mathcal{P}(u, v)$,

$$\Pr(y_t, s_t \mid x^{t+u}, s_0^{t-1}, y^{t-1}) = \Pr(y_t, s_t \mid x_t, s_{t-1}). \tag{12}$$

□

Proof: From equation (5), we have

$$\Pr(y_t, s_t \mid x^{t+u}, s_0^{t-1}, y^{t-1}) = \Pr(y_t, s_t \mid x_t, s_{t-1}) \, \frac{\Pr(x_{t+1}^{t+u} \mid x_t, s_{t-1}^{t}, y_t)}{\Pr(x_{t+1}^{t+u} \mid x_t, s_{t-1})}.$$

By Definition 3, given $x_t$ and $s_{t-1}$, the inputs $X_{t+1}^{t+u}$ are statistically independent of the output $y_t$ and the state $s_t$,
which implies that the fraction above equals 1. ■

Proposition 1 implies that the probability $\Pr(y_t, s_t \mid x^{t+u}, s_0^{t-1}, y^{t-1})$ is unaffected by the source selection from
$\mathcal{P}(u, v)$ and can be characterized by the channel only. From the
definition of sources $\mathcal{P}(u, v)$, we directly introduce a supremum as follows, which will be shown to be an upper
bound on the capacity of the non-controllable FSC.

Definition 5: Define $\mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$ as the supremum of the information rate $\mathcal{I}_v(X, S \to Y)$ over all sources
$\mathcal{P}(u, v)$ used with $u$-delayed FB and $v$-delayed SI, that is,

$$\mathcal{I}^*_{\mathrm{FB,SI}}(u, v) \triangleq \sup_{\mathcal{P}(u, v)} \mathcal{I}_v(X, S \to Y). \tag{13}$$

□
Combining the inequalities in (11) with the discussion after Definitions 3 and 4, we conclude this section with
the following proposition.

Proposition 2: 1) For any $u \ge 0$ and $v \ge 0$, we have

$$\mathcal{I}^*_{\mathrm{FB,SI}}(u+1, v+1) \le \mathcal{I}^*_{\mathrm{FB,SI}}(u+1, v) \le \mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$$

and

$$\mathcal{I}^*_{\mathrm{FB,SI}}(u+1, v+1) \le \mathcal{I}^*_{\mathrm{FB,SI}}(u, v+1) \le \mathcal{I}^*_{\mathrm{FB,SI}}(u, v).$$

2) For any $v \ge 1$, we have a nested sequence of upper bounds on the feedforward capacity

$$C \le \cdots \le \mathcal{I}^*_{\mathrm{FB,SI}}(v, v) \le \cdots \le \mathcal{I}^*_{\mathrm{FB,SI}}(1, 1) \le \mathcal{I}^*_{\mathrm{FB,SI}}(0, 0).$$

3) For any $v \ge 1$, we have a nested sequence of upper bounds on the feedback capacity

$$C^{\mathrm{fb}} \le \cdots \le \mathcal{I}^*_{\mathrm{FB,SI}}(0, v) \le \cdots \le \mathcal{I}^*_{\mathrm{FB,SI}}(0, 1) \le \mathcal{I}^*_{\mathrm{FB,SI}}(0, 0).$$

□

Proof: It is straightforward and omitted here. ■

IV. Three Theorems for Upper Bounds 

In this section, we introduce three main theorems that simplify the expressions for the upper bounds presented 
in Proposition 2 on the capacities of non-controllable FSCs. 
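Theorem 1: Let $v \ge 0$. For non-controllable FSCs,

$$I(X^t, S_0^{t-v-1}; Y_t \mid Y^{t-1}) = I(X_{t-v}^{t}, S_{t-v-1}; Y_t \mid Y^{t-1}) \tag{14}$$

and the information rate $\mathcal{I}_v(X, S \to Y)$ in (9) can be simplified as

$$\mathcal{I}_v(X, S \to Y) = \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} I(X_{t-v}^{t}, S_{t-v-1}; Y_t \mid Y^{t-1}). \tag{15}$$

□

Proof: See Appendix A. ■

Theorem 2: Let $0 \le u \le v$. The supremum $\mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$ is achieved by a $v$-th order conditional Markov source
with $u$-delayed FB and $v$-delayed SI, that is,

$$\mathcal{I}^*_{\mathrm{FB,SI}}(u, v) = \sup_{\mathcal{P}_v(u, v)} \mathcal{I}_v(X, S \to Y)$$

where $\mathcal{P}_v(u, v) = \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1})\}$. □

Proof: See Appendix B. ■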
By Theorem 2, to evaluate the supremum $\mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$, it is necessary to search the whole set of conditional
probabilities $\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1}),\ t = 1, 2, \ldots\}$. As time $t$ increases, the space of sequences $y^{t-u-1}$
expands exponentially, which makes it complicated to keep track of the dependence of the process $X_t$ on $y^{t-u-1}$.
In the sequel, we find some finite-size sufficient statistics to represent the sequence $y^{t-u-1}$.

Let A4 be the Cartesian product X"" x S"" whose elements are indexed simply by ^ G {0, 1, • • ■ , M — 1}
with M = \A4\. A random vector Af. is specified as the a posteriori probability vector with realization 

at = [at{0),at{l)r-- ,at(A/-l)] (16) 

where 

at {£) ^ Pr{{Xl^^„Slz:) = I ly'"") (17) 

for $\ell \in \{0, 1, \ldots, M-1\}$. The sample space of the random vector $A_t$ is denoted by $\mathcal{A}$, which is a simplex in $\mathbb{R}^M$.
That is, $\mathcal{A} = \{\alpha = [\alpha(0), \ldots, \alpha(M-1)] : \alpha(\ell) \ge 0, \sum_{\ell=0}^{M-1} \alpha(\ell) = 1\}$. Given $\alpha_{t-1}$, $y_{t-u}$ and the set of transition
probabilities $\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1})\}$, we can use the forward recursion of the BCJR algorithm [32] to
compute all values of $\alpha_t(\ell)$ as

E Pr(x*_.,s*::_i,yt-„|2;*-"-i) 
at{xl_^+,4-v) = ^'~^'^~^" , t — 1 t-u-i\ ^^^^ 

t 

where 

s'-lzl) Pr(xt|x*li,s*::ii\y*"""')Pr(2/t-u, st-u\xt-u. ■ (19) 

The equality (a) results from Proposition 1 and the assumption $u \le v$. From (19), we know that, once the prior
conditional probability vector $\alpha_{t-1}$ is given, the current conditional probability vector $\alpha_t$ depends only on the
current transition probability $\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1})$ and the channel transition law. To shorten the notation,
we abbreviate (18) and (19) as

$$\alpha_t = F_{\mathrm{BCJR}}(\alpha_{t-1}, \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1})\}, y_{t-u}). \tag{20}$$
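The structure of such a forward recursion can be illustrated with a generic hidden-Markov belief update: propagate the previous posterior through a transition matrix, weight by the likelihood of the new observation, and normalize. The sketch below is a simplified stand-in and does not reproduce the exact index set of (18) and (19); the function name `forward_update` and its arguments are assumptions made for illustration.

```python
def forward_update(alpha_prev, trans, likelihood):
    """One normalized forward (BCJR-style) step on a finite index set.

    alpha_prev[i]  : posterior of index i given past observations
    trans[i][j]    : probability of moving from index i to index j
    likelihood[j]  : probability of the new observation given index j
    Returns the updated posterior and the normalizer, i.e. the probability
    of the new observation given the past observations.
    """
    m = len(alpha_prev)
    unnormalized = [
        sum(alpha_prev[i] * trans[i][j] for i in range(m)) * likelihood[j]
        for j in range(m)
    ]
    norm = sum(unnormalized)
    return [u / norm for u in unnormalized], norm
```

For instance, with a two-state Gilbert-Elliott chain with $p(b|g) = p(g|b) = 0.3$, a uniform prior, and an observation "no bit error" with likelihoods $0.999$ (good) and $0.9$ (bad), the call `forward_update([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]], [0.999, 0.9])` shifts the belief slightly toward the good state. The returned normalizer plays the role of the disturbance probability used later in the stochastic control formulation.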

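Evidently, the vector $\alpha_{t-1}$ depends on $y^{t-u-1}$, and two different sequences $y^{t-u-1}$ and $\tilde{y}^{t-u-1}$ may result
in the same $\alpha_{t-1}$. For an arbitrarily selected source from $\mathcal{P}_v(u, v)$, two different sequences $y^{t-u-1}$ and $\tilde{y}^{t-u-1}$
may have $\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1}) \ne \Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \tilde{y}^{t-u-1})$. However, there do exist sources such that
different $y^{t-u-1}$ and $\tilde{y}^{t-u-1}$ resulting in the same vectors $\alpha_{t-1} = \tilde{\alpha}_{t-1}$ have
$\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1}) = \Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \tilde{y}^{t-u-1})$. Such a subclass of $\mathcal{P}_v(u, v)$ is defined as follows.

Definition 6: The set $\mathcal{P}'_v(u, v)$ collects all the $v$-th order conditional Markov sources with $u$-delayed FB and
$v$-delayed SI such that

$$\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1}) = \Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \tilde{y}^{t-u-1})$$

whenever $\alpha_{t-1} = \tilde{\alpha}_{t-1}$. Hence, the source set $\mathcal{P}'_v(u, v)$ can be shortly denoted by

$$\mathcal{P}'_v(u, v) \triangleq \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}.$$

□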

Fig. 4 depicts the non-controllable FSC model, whose source belongs to the set $\mathcal{P}'_v(u, v)$.

Fig. 4. A non-controllable FSC whose source is in the set $\mathcal{P}'_v(u, v)$.

Theorem 3: Let $u \le v$. The supremum $\mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$ can be achieved by a source in the set $\mathcal{P}'_v(u, v)$, that is,

$$\mathcal{I}^*_{\mathrm{FB,SI}}(u, v) = \sup_{\mathcal{P}'_v(u, v)} \mathcal{I}_v(X, S \to Y) \tag{21}$$

where $\mathcal{P}'_v(u, v) = \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}$. □

Proof: See Appendix C. ■



V. Dynamic Programming for Source Optimization
A. Stochastic Control Formulation 

From Theorem 3, we only need to consider the sources in the set $\mathcal{P}'_v(u, v)$. In this setting, for any given $y^{t-u-1}$,
we have



Pr(x*_„,Si_._i,y,*_J2/*-"-i; 



(a) 



(b) 



E/' t-l t-U-l\ D,/' I t-1 t-U-1 \ 



(22) 



where equality (a) results from Proposition 1 and the assumption $u \le v$, and equality (b) results directly from the
definition of the source set $\mathcal{P}'_v(u, v)$. Similar to equation (40) as shown in Appendix B, we can prove that the
conditional probability Pr(yj_jj_|_j^ [xj^^jS^I^^]^) is completely determined by the channel law. Therefore, equalities 
in (22) indicate that the joint conditional probability mass function on the left-hand side of (22) is not sensitive to 
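the vector $y^{t-u-1}$ (that appears in the conditioning clause) but to its induced variable $\alpha_{t-1}$. This implies that

$$I(X_{t-v}^{t}, S_{t-v-1}; Y_t \mid Y_{t-u+1}^{t-1}, y^{t-u}) = I(X_{t-v}^{t}, S_{t-v-1}; Y_t \mid Y_{t-u+1}^{t-1}, \alpha_{t-1}, y_{t-u}) \tag{23}$$

of which the right-hand side is a function of $\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}$ and $y_{t-u}$. For simplicity, this
function is denoted as

$$g(\alpha_{t-1}, \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}, y_{t-u}). \tag{24}$$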
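Therefore, we can rewrite the information rate $\mathcal{I}_v(X, S \to Y)$ in (15) as

$$\mathcal{I}_v(X, S \to Y) = \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} I(X_{t-v}^{t}, S_{t-v-1}; Y_t \mid Y_{t-u+1}^{t-1}, Y^{t-u})$$
$$= \lim_{N \to \infty} \frac{1}{N} \mathrm{E}_{Y^{N-u}} \left[ \sum_{t=1}^{N} g(A_{t-1}, \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, A_{t-1})\}, Y_{t-u}) \right]. \tag{25}$$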



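Substituting (25) into (21), we can see that the problem of finding the upper bound $\mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$ is equivalent to an
infinite-horizon average reward per stage stochastic control problem [29,30]. In fact, the stochastic control system
has the form

$$\alpha_t = F_{\mathrm{BCJR}}(\alpha_{t-1}, \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}, y_{t-u}) \tag{26}$$

where $\alpha_{t-1} \in \mathcal{A}$ is the state of the dynamic system, $\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}$ is the policy or control, and
$y_{t-u}$ is the system disturbance. The reward function at stage $t$ is $g(\alpha_{t-1}, \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}, y_{t-u})$.
The Bellman equation [29,30] is, for any state $\alpha_{t-1} \in \mathcal{A}$,

$$\sigma + J(\alpha_{t-1}) = \max_{\mathcal{P}'_v(u, v)} \left\{ \Phi(\alpha_{t-1}, \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}) + \mathrm{E}[J(A_t) \mid \alpha_{t-1}] \right\} \tag{27}$$

where $\sigma$ is the optimal average reward, $J(\alpha)$ is the optimal relative reward-to-go function, $A_t$ is the random
variable with realization $\alpha_t$ whose randomness depends on the disturbance variable $Y_{t-u}$, and

$$\Phi(\alpha_{t-1}, \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}) = \mathrm{E}_{Y_{t-u}} \left[ g(\alpha_{t-1}, \{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}, Y_{t-u}) \right]$$
$$= I(X_{t-v}^{t}, S_{t-v-1}; Y_t \mid Y_{t-u}^{t-1}, \alpha_{t-1}). \tag{28}$$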

Proposition 3: The system disturbance variable $Y_{t-u}$ is characterized by a conditional probability distribution
that depends explicitly on the system state $\alpha_{t-1}$ and the policy $\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}$. □

Proof: Given the system state $\alpha_{t-1}$ and the policy $\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \alpha_{t-1})\}$, the probability mass function
of the system disturbance can be explicitly determined as

Pr I at-i , { Pr{xt | a;*:^ , 5*1^1^ 1) } ) 

= H^Unstt-i^yt-u\at-^,{PT{xt\xlzl,slZ'iZl,at_,)}) 

t t-u 

J2 at-i{xtl/t^^Pr{xt\xt^,/t^-lat_,)Priyt^,St^^^^ (29) 

where the equality (a) follows from Proposition 1 and the assumption $u \le v$. ■

Proposition 4: The state process $A_t$ with realization $\alpha_t$ is a Markov process. □

Proof: Equation (26) and Proposition 3 imply that, given the prior state $A_{t-1}$, the current state $A_t$ is independent
of $A_0^{t-2}$, that is, $A_t$ is a Markov process. ■

Theorem 4: There exist a scalar $\sigma^*$ and a function $\{J^*(\alpha) : \alpha \in \mathcal{A}\}$ that satisfy Bellman's equation (27).
Furthermore, there exists a stationary policy to solve the Bellman equation.

Proof: The considered average reward per stage stochastic control problem can be reformulated as a discounted
cost dynamic problem by [30, Ch. 4]. For this dynamic problem, we have the following facts. Firstly, the state
space $\mathcal{A}$ is a Borel subset of a Polish (complete, separable metric) space. Secondly, the policy space $\{\Pr(j|i, \alpha)\}$ is
a compact metric space. Thirdly, the reward function $\Phi$ is bounded and continuous. Fourthly, once the state $\alpha_{t-1}$
and the policy are given, the transition probability of the next state $\alpha_t$ is determined. Then, we can complete the
proof of this theorem by the theorem in [33, Ch. 6, Theorem 4.1].

An alternative proof of this theorem follows from the proposition in [30, Proposition 4.6.5]. ■

From Theorem 4, it suffices to investigate only stationary policies. For convenience, the stationary policy in the
set $\mathcal{P}'_v(u, v)$ is denoted by

$$\{\Pr(j|i, \alpha)\} \triangleq \{\Pr(X_t = j \mid (X_{t-v}^{t-1}, S_{t-v-1}) = i, A_{t-1} = \alpha)\}.$$

We note that the information rate $\mathcal{I}_v(X, S \to Y)$ in (25) induced by a stationary source $\{\Pr(j|i, \alpha)\}$ can be
computed using Monte Carlo methods similar to those in [5-8].
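To give a flavor of such simulation-based estimates, the sketch below treats a deliberately simple special case: a Gilbert-Elliott channel driven by unconstrained i.u.d. binary inputs (no RLL constraint, no feedback). In that case the output is also i.u.d., so the information rate equals one minus the entropy rate of the noise sequence $z_t = x_t \oplus y_t$, and the entropy rate is estimated from one long simulated realization by a normalized forward recursion. The function name, the assumed known initial state $s_0 = g$, and the restriction to i.u.d. inputs are assumptions made for this illustration; it is not the estimator for the conditional Markov sources of this section.

```python
import random
from math import log2


def ge_iud_rate_estimate(n, p_bg, p_gb, eps_g, eps_b, seed=0):
    """Monte Carlo estimate (bits/use) of the i.u.d. rate of a GE channel."""
    rng = random.Random(seed)
    trans = {"g": {"g": 1 - p_bg, "b": p_bg},
             "b": {"g": p_gb, "b": 1 - p_gb}}
    eps = {"g": eps_g, "b": eps_b}
    state = "g"                       # assumed initial state s_0
    belief = {"g": 1.0, "b": 0.0}     # Pr(s_{t-1} | z^{t-1}), s_0 assumed known
    log_prob = 0.0
    for _ in range(n):
        # Simulate the freely evolving state and the noise bit z_t.
        state = "b" if rng.random() < trans[state]["b"] else "g"
        z = int(rng.random() < eps[state])
        # Forward recursion: joint[s] = Pr(s_t = s, z_t | z^{t-1}).
        joint = {}
        for s in "gb":
            prior = sum(belief[r] * trans[r][s] for r in "gb")
            joint[s] = prior * (eps[s] if z == 1 else 1 - eps[s])
        pz = sum(joint.values())      # Pr(z_t | z^{t-1})
        log_prob += log2(pz)
        belief = {s: joint[s] / pz for s in "gb"}
    noise_entropy_rate = -log_prob / n
    return 1.0 - noise_entropy_rate
```

The estimate converges with growing `n`. When both states share the same cross-over probability $\epsilon$, the channel reduces to a memoryless BSC and the estimate approaches $1 - h(\epsilon)$, where $h$ is the binary entropy function.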

B. Value Iteration Algorithm to Optimize the Source 

For an average reward per stage stochastic control problem, there exist several efficient dynamic programming
algorithms (such as value iteration, policy iteration and linear programming) [29, 30] to solve Bellman's equation
and find an optimal stationary policy. In this subsection, a value iteration algorithm to optimize the stationary
Markov source $\{\Pr(j|i, \alpha)\}$ is described. Initially, we choose a terminal reward function $J_0(\alpha) = 0$ for any $\alpha$.

Then, the optimal $k$-stage reward-to-go functions $J_k(\alpha)$ ($k = 1, 2, \ldots$) are recursively generated by

$$J_k(\alpha) = \max_{\{\Pr(j|i, \alpha)\}} \left\{ \Phi(\alpha, \{\Pr(j|i, \alpha)\}) + \mathrm{E}[J_{k-1}(A') \mid \alpha] \right\} \tag{30}$$

where $A'$ is the random variable whose randomness depends on the system disturbance variable $Y_{t-u}$, and where
the realization $\alpha'$ of $A'$ can be computed by the BCJR algorithm as

$$\alpha' = F_{\mathrm{BCJR}}(\alpha, \{\Pr(j|i, \alpha)\}, y_{t-u}). \tag{31}$$

Moreover, an optimal policy is obtained at stage $k$ as

$$\{\Pr(j|i, \alpha)\}_k = \arg\max_{\{\Pr(j|i, \alpha)\}} \left\{ \Phi(\alpha, \{\Pr(j|i, \alpha)\}) + \mathrm{E}[J_{k-1}(A') \mid \alpha] \right\}. \tag{32}$$

The information rate $\mathcal{I}_v(X, S \to Y)$ induced by the stationary source $\{\Pr(j|i, \alpha)\}_k$ converges to the maximum
$\sigma^* = \mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$ when $k$ goes to infinity [29,30]. Thus, the source distribution determined by (32) is an optimal
source distribution as $k$ goes to infinity.
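The recursion (30) to (32) has the same shape as value iteration for any average-reward problem. The sketch below shows that shape on a toy Markov decision process with finitely many states and actions; it uses relative value iteration, a standard variant that subtracts the value at a reference state in every sweep so that the iterates stay bounded. The tabular inputs `P` and `R` and the function name are assumptions made for illustration and stand in for the continuous state $\alpha$ and the policy $\{\Pr(j|i, \alpha)\}$ of the paper.

```python
def relative_value_iteration(P, R, n_iter=500, ref=0):
    """Average-reward value iteration on a finite toy MDP.

    P[a][s][s2] : probability of moving from state s to s2 under action a
    R[s][a]     : one-stage reward of action a in state s
    Returns (gain, relative values, greedy policy); n_iter must be >= 1.
    """
    n_states = len(R)
    n_actions = len(R[0])
    h = [0.0] * n_states              # terminal reward function J_0 = 0
    gain = 0.0
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iter):
        # One Bellman backup: stage reward plus expected reward-to-go.
        q = [[R[s][a] + sum(P[a][s][s2] * h[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        j = [max(row) for row in q]
        gain = j[ref]                 # estimate of the optimal average reward
        h = [value - gain for value in j]
    policy = [max(range(n_actions), key=lambda a, s=s: q[s][a])
              for s in range(n_states)]
    return gain, h, policy
```

In a two-state example where staying in state 0 earns reward 1, staying in state 1 earns 0, and switching earns 0.5, the iteration returns an average reward of 1 together with the policy "stay in state 0, switch out of state 1". Convergence of this simple variant relies on aperiodicity-type conditions, which hold for this toy example but are not checked by the code.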

In general, it is infeasible to find the optimal stationary policy in closed form using the value iteration algorithm.
The following is a quantized numerical approximation of the value iteration algorithm.

Algorithm 1 (Quantized Value Iteration Algorithm):
1) Initialization:
• Choose a large positive integer $n$.
• Choose a finite quantizer $\hat{\alpha} = Q(\alpha)$.
• Initialize the terminal reward function as $J_0(\hat{\alpha}) = 0$ for all $\hat{\alpha}$.
2) Recursions:
For $k = 1, 2, \ldots, n$, and any $\hat{\alpha}$, compute the $k$-stage reward-to-go functions as

$$J_k(\hat{\alpha}) = \max_{\{\Pr(j|i, \hat{\alpha})\}} \left\{ \Phi(\hat{\alpha}, \{\Pr(j|i, \hat{\alpha})\}) + \mathrm{E}[J_{k-1}(\hat{A}') \mid \hat{\alpha}] \right\} \tag{33}$$

where $\hat{A}'$ is the random variable whose randomness depends on the system disturbance variable $Y_{t-u}$, and
where the realization $\hat{\alpha}'$ of $\hat{A}'$ can be computed by

$$\hat{\alpha}' = Q\left(F_{\mathrm{BCJR}}(\hat{\alpha}, \{\Pr(j|i, \hat{\alpha})\}, y_{t-u})\right). \tag{34}$$

3) Optimized source:
For any $\hat{\alpha}$, the optimized source distribution is delivered as

$$\{\Pr(j|i, \hat{\alpha})\} = \arg\max_{\{\Pr(j|i, \hat{\alpha})\}} \left\{ \Phi(\hat{\alpha}, \{\Pr(j|i, \hat{\alpha})\}) + \mathrm{E}[J_n(\hat{A}') \mid \hat{\alpha}] \right\}. \tag{35}$$

Remark: To perform Algorithm 1, the stationary Markov source probabilities $\{\Pr(j|i, \hat{\alpha})\}$ are approximated by
their discretized versions. The maximizations in (33) and (35) are then implemented by exhaustive searches. Thus,
the optimality of the resulting source distribution (35) is limited by the resolution of the quantizers involved. Strictly
speaking, the information rate $\mathcal{I}_v(X, S \to Y)$ induced by the "optimal" quantized source obtained in (35) is only
a lower bound on the upper bound $\mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$. Intuitively, finer quantizers should cause less loss of optimality.
This intuition is verified by the numerical results shown in the following section.

Fig. 5. Bounds on the capacities of the RLL(1, ∞)-GE channel.



VI. Numerical Results

In this section, we present numerical results by taking the RLL(1, ∞)-GE channel shown in Fig. 1 and Fig. 2
as an example. We chose this channel because it was already used in a prior publication [10]. In this example, we
set the transition probabilities between the channel states as $p(b|g) = p(g|b) = 0.3$, the cross-over probability in the
"good" state as $\epsilon_g = 0.001$ and the cross-over probability in the "bad" state as a variable $\epsilon_b \in [0, 1]$. We first apply
the quantized value iteration algorithm in Section V to optimize the sources and then use Monte Carlo methods [5-8]
to numerically evaluate the upper bounds $\mathcal{I}^*_{\mathrm{FB,SI}}(u, v)$. The results are shown in Fig. 5, where $\mathcal{I}^*_{\mathrm{FB,SI}}(1, 1)$
and $\mathcal{I}^*_{\mathrm{FB,SI}}(2, 2)$ are two upper bounds on the feedforward capacity, and $\mathcal{I}^*_{\mathrm{FB,SI}}(0, 0)$ and $\mathcal{I}^*_{\mathrm{FB,SI}}(0, 1)$ are two
upper bounds on the feedback capacity. As expected, $\mathcal{I}^*_{\mathrm{FB,SI}}(2, 2) \le \mathcal{I}^*_{\mathrm{FB,SI}}(1, 1) \le \mathcal{I}^*_{\mathrm{FB,SI}}(0, 1) \le \mathcal{I}^*_{\mathrm{FB,SI}}(0, 0)$.
It is worth pointing out that, due to the RLL constraint, the source must have memory of order at least one
and the optimization is implemented by taking into account the RLL constraint. In particular, the upper bound
$\mathcal{I}^*_{\mathrm{FB,SI}}(0, 0)$ is obtained by optimizing the sources $\mathcal{P}_1(0, 0)$. Also shown in Fig. 5 is a lower bound on $C$ computed
using techniques presented in [9, 10]. By comparing $\mathcal{I}^*_{\mathrm{FB,SI}}(2, 2)$ with the lower bound, we observe that the bounds
$\mathcal{I}^*_{\mathrm{FB,SI}}(v, v)$ are numerically tight upper bounds on the feedforward capacity. We are unable to evaluate the tightness
of the upper bounds $\mathcal{I}^*_{\mathrm{FB,SI}}(0, v)$ on the feedback capacity since no good lower bounds on $C^{\mathrm{fb}}$ are available in
the literature for non-controllable FSCs.

Fig. 6. Information rates $\mathcal{I}_1(X, S \to Y)$ for "optimal" quantized sources in $\mathcal{P}'_1(1, 1)$ delivered by Algorithm 1 with different quantizers,
where $\delta$ is the quantization parameter. Horizontal axis: cross-over probability in bad state ($\epsilon_b$).

Fig. 6 illustrates the loss of optimality caused by quantization. We focus on the computation of $\mathcal{I}_{\mathrm{FB,SI}}(1,1)$.
In this case, the state variable $\underline{\alpha}$ has four components. Let $\delta > 0$ be the quantization parameter. The quantized state
variable $\hat{\underline{\alpha}}$ is determined by $\hat{\alpha}(i) = [\alpha(i)/\delta] \cdot \delta$, where $[x]$ stands for the closest integer to $x$. If necessary, a slight
adjustment is made to guarantee that $\sum_i \hat{\alpha}(i) = 1$. For example, for $\delta = 0.1$, the vector $\underline{\alpha} = [0.12, 0.25, 0.375, 0.255]$
is approximated by the vector $\hat{\underline{\alpha}} = [0.1, 0.2, 0.4, 0.3]$. From Fig. 6, we can see that a smaller $\delta$ (equivalently, a finer
quantizer) induces a larger information rate $\mathcal{I}_v(X, S \to Y)$ and causes less loss of optimality. It can also be seen
that the gap between the different quantizers is negligible for small quantization parameters $\delta$.
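The quantization rule can be sketched in a few lines. This is an illustrative sketch only: the text does not specify how the "slight adjustment" is made, so the tie-breaking rule below (move one step of size $\delta$ on the component with the largest rounding error) is an assumption, as are the function name `quantize_probability_vector` and the requirement that $1/\delta$ be an integer.

```python
import numpy as np

def quantize_probability_vector(alpha, delta):
    """Round each component of a probability vector to the nearest multiple of
    delta, then adjust so the result still sums to one.
    Assumptions: 1/delta is an integer; the adjustment rule is illustrative."""
    alpha = np.asarray(alpha, dtype=float)
    levels = int(round(1.0 / delta))          # number of delta-steps that sum to one
    scaled = alpha / delta
    counts = np.rint(scaled).astype(int)      # closest integer to alpha(i)/delta
    excess = int(counts.sum()) - levels
    while excess > 0:
        # Too much mass: lower the component that was rounded up the most.
        err = np.where(counts > 0, counts - scaled, -np.inf)
        counts[int(np.argmax(err))] -= 1
        excess -= 1
    while excess < 0:
        # Too little mass: raise the component that was rounded down the most.
        counts[int(np.argmin(counts - scaled))] += 1
        excess += 1
    return counts, counts * delta

if __name__ == "__main__":
    counts, quantized = quantize_probability_vector([0.12, 0.25, 0.375, 0.255], 0.1)
    print(counts, quantized)
```

Working with integer step counts rather than floating-point multiples of $\delta$ keeps the sum-to-one check exact. On the example vector from the text, the step counts are 1, 2, 4 and 3, which corresponds to the quantized vector given above.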
VII. Conclusion

We have developed a universal form of upper bounds on the capacities of non-controllable FSCs with/without
feedback. We obtained two nested sequences $\{\mathcal{I}_{\mathrm{FB,SI}}(v,v)\}$ and $\{\mathcal{I}_{\mathrm{FB,SI}}(0,v)\}$ of upper bounds on feedforward
capacities and feedback capacities, respectively, where the upper bound $\mathcal{I}_{\mathrm{FB,SI}}(u,v)$ is the supremum of the directed
information rate $\mathcal{I}_v(X, S \to Y)$ from the new channel input (including the source and the delayed channel state)
to the channel output. These upper bounds can be achieved by $v$-th order conditional Markov sources $\mathcal{P}_v(u,v)$
with $u$-delayed output feedback (FB) and $v$-delayed state information (SI). Moreover, the computation of the upper
bounds $\mathcal{I}_{\mathrm{FB,SI}}(u,v)$ was formulated as an average-reward-per-stage stochastic control problem [29,30]. Using
a quantized value iteration algorithm, a quantized (sub)optimal conditional Markov source is obtained and the upper
bounds are numerically evaluated using Monte Carlo methods [5-8].

Appendix A 
Proof of Theorem 1 

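Proof: For any $v \ge 0$, by using the chain rule for mutual information, we have

$$I(X^t, S_0^{t-v-1}; Y_t \mid Y^{t-1}) = I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y^{t-1}) + I(X^{t-v-1}, S_0^{t-v-2}; Y_t \mid X_{t-v}^t, S_{t-v-1}, Y^{t-1}). \tag{36}$$

The last term equals zero, since $Y_t$ is independent of $S_0^{t-v-2}$ and $X^{t-v-1}$ if $x_{t-v}^t$, $s_{t-v-1}$, and $y^{t-1}$ are given. ■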
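The conditional independence used in the last step can be spelled out. The following is a sketch under two assumptions that are not restated in this appendix: the source at time $\tau$ depends only on $(x^{\tau-1}, s_0^{\tau-v-1}, y^{\tau-u-1})$, and the channel law has the form $\Pr(y_\tau, s_\tau \mid x_\tau, s_{\tau-1})$. Factoring the joint distribution into source and channel terms, every source term and every channel term with $\tau < t-v$ is constant in the summation variables and cancels between numerator and denominator, which leaves

```latex
\Pr\bigl(y_t \mid x^t, s_0^{t-v-1}, y^{t-1}\bigr)
  = \frac{\displaystyle\sum_{s_{t-v}^{t}} \prod_{\tau=t-v}^{t}
            \Pr(y_\tau, s_\tau \mid x_\tau, s_{\tau-1})}
         {\displaystyle\sum_{s_{t-v}^{t-1}} \prod_{\tau=t-v}^{t-1}
            \Pr(y_\tau, s_\tau \mid x_\tau, s_{\tau-1})}
```

The right-hand side depends only on $x_{t-v}^t$, $s_{t-v-1}$ and $y_{t-v}^{t}$, so $Y_t$ is conditionally independent of $(X^{t-v-1}, S_0^{t-v-2})$ given $(X_{t-v}^t, S_{t-v-1}, Y^{t-1})$. For $v = 0$ the denominator is an empty product and equals one.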

Appendix B 
Proof of Theorem 2 

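Proof: Let $\mathcal{P}_1 \in \mathcal{P}(u,v)$ be an arbitrary source with $u$-delayed FB and $v$-delayed SI. Denote the corresponding
information as $I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y^{t-1})$. To prove Theorem 2, it is sufficient to show that there exists a
conditional Markov source $\mathcal{P}_2$ in $\mathcal{P}_v(u,v) \subset \mathcal{P}(u,v)$ with the same information $I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y^{t-1})$.
To do this, for any given $\mathcal{P}_1 \in \mathcal{P}(u,v)$, we construct a new source $\mathcal{P}_2 \in \mathcal{P}_v(u,v)$ as

$$\mathrm{Pr}^{(\mathcal{P}_2)}(x_t \mid x^{t-1}, s_0^{t-v-1}, y^{t-u-1}) \triangleq \mathrm{Pr}^{(\mathcal{P}_1)}(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1}) \tag{37}$$

with the initial probability as $\mathrm{Pr}^{(\mathcal{P}_2)}(x^v, s_0^{v-u}, y^{v-u}) = \mathrm{Pr}^{(\mathcal{P}_1)}(x^v, s_0^{v-u}, y^{v-u})$.

In the following, we will prove that both $\mathcal{P}_1$ and $\mathcal{P}_2$ induce the same joint probability distribution $\Pr(x_{t-v}^t, s_{t-v-1}, y^t)$,
which, together with the result of Theorem 1, completes the proof of Theorem 2.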

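Actually, for any source with $u$-delayed FB and $v$-delayed SI, we have

$$\begin{aligned}
\Pr(x_{t-v}^t, s_{t-v-1}, y^t) = {} & \sum_{x^{t-v-1},\, s_0^{t-v-2},\, s_{t-v}^{t-u}} \; \prod_{\tau=1}^{t} \Pr(x_\tau \mid x^{\tau-1}, s_0^{\tau-v-1}, y^{\tau-u-1}) \, \Pr(y_{\tau-u}, s_{\tau-u} \mid x^{\tau}, s_0^{\tau-u-1}, y^{\tau-u-1}) \\
& \times \Pr(y_{t-u+1}^t \mid x^t, s_0^{t-u}, y^{t-u}).
\end{aligned} \tag{38}$$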
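The conditional probabilities $\Pr(y_{\tau-u}, s_{\tau-u} \mid x^{\tau}, s_0^{\tau-u-1}, y^{\tau-u-1})$ and $\Pr(y_{t-u+1}^t \mid x^t, s_0^{t-u}, y^{t-u})$ are all independent of the
source, since

$$\Pr(y_{\tau-u}, s_{\tau-u} \mid x^{\tau}, s_0^{\tau-u-1}, y^{\tau-u-1}) \overset{(a)}{=} \Pr(y_{\tau-u}, s_{\tau-u} \mid x_{\tau-u}, s_{\tau-u-1}) \overset{(b)}{=} \Pr(y_{\tau-u}, s_{\tau-u} \mid x_{\tau-v}^{\tau}, s_{\tau-v-1}^{\tau-u-1}, y^{\tau-u-1}) \tag{39}$$

and

$$\begin{aligned}
\Pr(y_{t-u+1}^t \mid x^t, s_0^{t-u}, y^{t-u}) & = \sum_{s_{t-u+1}^{t}} \; \prod_{\tau=t-u+1}^{t} \Pr(y_\tau, s_\tau \mid x^t, s_0^{\tau-1}, y^{\tau-1}) \\
& \overset{(c)}{=} \sum_{s_{t-u+1}^{t}} \; \prod_{\tau=t-u+1}^{t} \Pr(y_\tau, s_\tau \mid x_\tau, s_{\tau-1}) \\
& \overset{(d)}{=} \Pr(y_{t-u+1}^t \mid x_{t-v}^t, s_{t-v-1}^{t-u}, y^{t-u})
\end{aligned} \tag{40}$$

where equalities (a), (b), (c) and (d) result from Proposition 1 and the assumption $u \le v$. Equalities (a) and (c) also
state that the conditional probabilities $\Pr(y_{\tau-u}, s_{\tau-u} \mid x^{\tau}, s_0^{\tau-u-1}, y^{\tau-u-1})$ and $\Pr(y_{t-u+1}^t \mid x^t, s_0^{t-u}, y^{t-u})$ are completely
determined by the channel transition law.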

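Therefore, the given source $\mathcal{P}_1 \in \mathcal{P}(u,v)$ induces the joint probability as

$$\begin{aligned}
\mathrm{Pr}^{(\mathcal{P}_1)}(x_{t-v}^t, s_{t-v-1}, y^t) = {} & \sum_{x^{t-v-1},\, s_0^{t-v-2},\, s_{t-v}^{t-u}} \mathrm{Pr}^{(\mathcal{P}_1)}(x^v, s_0^{v-u}, y^{v-u}) \prod_{\tau=v+1}^{t} \Bigl[ \mathrm{Pr}^{(\mathcal{P}_1)}(x_\tau \mid x^{\tau-1}, s_0^{\tau-v-1}, y^{\tau-u-1}) \\
& \times \Pr(y_{\tau-u}, s_{\tau-u} \mid x_{\tau-v}^{\tau}, s_{\tau-v-1}^{\tau-u-1}, y^{\tau-u-1}) \Bigr] \, \Pr(y_{t-u+1}^t \mid x_{t-v}^t, s_{t-v-1}^{t-u}, y^{t-u})
\end{aligned} \tag{41}$$

and the conditional probability as

$$\mathrm{Pr}^{(\mathcal{P}_1)}(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1}) = \frac{\mathrm{Pr}^{(\mathcal{P}_1)}(x_{t-v}^{t}, s_{t-v-1}, y^{t-u-1})}{\mathrm{Pr}^{(\mathcal{P}_1)}(x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1})} = \frac{\sum_{x^{t-v-1},\, s_0^{t-v-2}} \mathrm{Pr}^{(\mathcal{P}_1)}(x^{t}, s_0^{t-v-1}, y^{t-u-1})}{\sum_{x^{t-v-1},\, s_0^{t-v-2}} \mathrm{Pr}^{(\mathcal{P}_1)}(x^{t-1}, s_0^{t-v-1}, y^{t-u-1})} \tag{42}$$

where

$$\mathrm{Pr}^{(\mathcal{P}_1)}(x^{t}, s_0^{t-v-1}, y^{t-u-1}) = \mathrm{Pr}^{(\mathcal{P}_1)}(x^{t-1}, s_0^{t-v-1}, y^{t-u-1}) \, \mathrm{Pr}^{(\mathcal{P}_1)}(x_t \mid x^{t-1}, s_0^{t-v-1}, y^{t-u-1})$$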

and 



(43) 
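On the other hand, the source $\mathcal{P}_2 \in \mathcal{P}_v(u,v)$ constructed as (37) induces the joint probability as

$$\begin{aligned}
\mathrm{Pr}^{(\mathcal{P}_2)}(x_{t-v}^t, s_{t-v-1}, y^t)
& \overset{(e)}{=} \sum_{x^{t-v-1},\, s_0^{t-v-2},\, s_{t-v}^{t-u}} \mathrm{Pr}^{(\mathcal{P}_1)}(x^v, s_0^{v-u}, y^{v-u}) \prod_{\tau=v+1}^{t} \Bigl[ \mathrm{Pr}^{(\mathcal{P}_1)}(x_\tau \mid x_{\tau-v}^{\tau-1}, s_{\tau-v-1}, y^{\tau-u-1}) \\
& \qquad \times \Pr(y_{\tau-u}, s_{\tau-u} \mid x_{\tau-v}^{\tau}, s_{\tau-v-1}^{\tau-u-1}, y^{\tau-u-1}) \Bigr] \, \Pr(y_{t-u+1}^t \mid x_{t-v}^t, s_{t-v-1}^{t-u}, y^{t-u}) \\
& \overset{(f)}{=} \sum_{x^{t-v-1},\, s_0^{t-v-2},\, s_{t-v}^{t-u}} \mathrm{Pr}^{(\mathcal{P}_1)}(x^v, s_0^{v-u}, y^{v-u}) \prod_{\tau=v+1}^{t} \Biggl[ \frac{\mathrm{Pr}^{(\mathcal{P}_1)}(x_{\tau-v}^{\tau}, s_{\tau-v-1}, y^{\tau-u-1})}{\mathrm{Pr}^{(\mathcal{P}_1)}(x_{\tau-v}^{\tau-1}, s_{\tau-v-1}, y^{\tau-u-1})} \\
& \qquad \times \Pr(y_{\tau-u}, s_{\tau-u} \mid x_{\tau-v}^{\tau}, s_{\tau-v-1}^{\tau-u-1}, y^{\tau-u-1}) \Biggr] \, \Pr(y_{t-u+1}^t \mid x_{t-v}^t, s_{t-v-1}^{t-u}, y^{t-u}) \\
& \overset{(g)}{=} \mathrm{Pr}^{(\mathcal{P}_1)}(x_{t-v}^t, s_{t-v-1}, y^t)
\end{aligned} \tag{44}$$

where equality (e) follows from the construction of the source $\mathcal{P}_2$, equality (f) results from the conditional probability
in (42), and equality (g) is obtained by summing and canceling the numerators and the denominators in successive
fractions starting at $\tau = v + 1$ and considering $\mathrm{Pr}^{(\mathcal{P}_1)}(x^v, s_0^{v-u}, y^{v-u})$.

Equalities in (44) imply that the source $\mathcal{P}_2 \in \mathcal{P}_v(u,v) \subset \mathcal{P}(u,v)$ induces the same information $I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y^{t-1})$
as the source $\mathcal{P}_1 \in \mathcal{P}(u,v)$ does. Since $\mathcal{P}_1$ is chosen from $\mathcal{P}(u,v)$ arbitrarily, the supremum $\mathcal{I}_{\mathrm{FB,SI}}(u,v)$ can be
taken over the set of conditional Markov sources $\mathcal{P}_v(u,v)$ instead of over the set $\mathcal{P}(u,v)$. ■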

Appendix C 
Proof of Theorem 3 

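Proof: For convenience, the conditional probabilities $\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1})$ and $\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \underline{\alpha}_{t-1})$
are both referred to as policies at time $t$. To prove Theorem 3, we shall show that the vector of the a posteriori
probabilities $\underline{\alpha}_{t-1}$ can be used to replace the delayed feedback $y^{t-u-1}$ for the purpose of determining the optimal policies
to achieve the supremum of $\mathcal{I}_v(X, S \to Y)$. First, we show that Bellman's principle of optimality [29,30] holds. For
any time instant $T$ in the interval $[1, N]$, we decompose the information rate $\sum_{t=1}^{N} I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y^{t-1})$ as

$$\sum_{t=1}^{N} I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y^{t-1}) = \sum_{t=1}^{T-1} I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y^{t-1}) + \sum_{t=T}^{N} I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y^{t-1}). \tag{45}$$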



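Similar to (38) in the proof of Theorem 2, we have

$$\begin{aligned}
\Pr(x_{T-v-1}^{T-1}, s_{T-v-2}, y^{T-1}) = {} & \sum_{x^{T-v-2},\, s_0^{T-v-3},\, s_{T-v-1}^{T-u-1}} \; \prod_{\tau=1}^{T-1} \Pr(x_\tau \mid x_{\tau-v}^{\tau-1}, s_{\tau-v-1}, y^{\tau-u-1}) \, \Pr(s_{\tau-u}, y_{\tau-u} \mid x_{\tau-u}, s_{\tau-u-1}) \\
& \times \Pr(y_{T-u}^{T-1} \mid x_{T-u}^{T-1}, s_{T-u-1})
\end{aligned} \tag{46}$$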
which is independent of policies after time $T$, i.e., independent of the policies in the set $\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{t-u-1}) \mid T \le t \le N\}$.
Therefore, if optimal policies from time 1 to $T$ are given, then the corresponding policies after time $T$ must
be optimal in the sense that they maximize the last term of (45). Thus we have proved Bellman's principle of
optimality [29,30].

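Next, we show that if after time $T$ we utilize policies

$$\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \underline{\alpha}_{T-1}, y_{T-u}^{t-u-1}) \mid T \le t \le N\}$$

instead of the general policies

$$\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{T-u-1}, y_{T-u}^{t-u-1}) \mid T \le t \le N\}$$

we can still maximize the last term in (45). To show this, suppose that two different sequences $y^{T-u-1}$ and $\tilde{y}^{T-u-1}$
induce the same a posteriori probability vectors $\underline{\alpha}_{T-1}$ and $\tilde{\underline{\alpha}}_{T-1}$, that is, for all $(x_{T-v}^{T-1}, s_{T-v-1}^{T-u-1})$, we have

$$\Pr(x_{T-v}^{T-1}, s_{T-v-1}^{T-u-1} \mid y^{T-u-1}) = \Pr(x_{T-v}^{T-1}, s_{T-v-1}^{T-u-1} \mid \tilde{y}^{T-u-1}).$$

For the different sequences $y^{T-u-1}$ and $\tilde{y}^{T-u-1}$, if we use the same policies after time $T$, i.e., for all $t$ in the
interval $T \le t \le N$,

$$\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, y^{T-u-1}, y_{T-u}^{t-u-1}) = \Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \tilde{y}^{T-u-1}, y_{T-u}^{t-u-1}),$$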
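then we have

$$\begin{aligned}
\Pr(x_{T-v}^{N}, s_{T-v-1}^{N-v-1}, y_{T-u}^{N} \mid y^{T-u-1})
& = \sum_{s_{N-v}^{N-u}} \Pr(x_{T-v}^{T-1}, s_{T-v-1}^{T-u-1} \mid y^{T-u-1}) \, \Pr(x_{T}^{N}, s_{T-u}^{N-u}, y_{T-u}^{N-u} \mid x_{T-v}^{T-1}, s_{T-v-1}^{T-u-1}, y^{T-u-1}) \\
& \qquad \times \Pr(y_{N-u+1}^{N} \mid x_{T-v}^{N}, s_{T-v-1}^{N-u}, y^{N-u}) \\
& \overset{(a)}{=} \sum_{s_{N-v}^{N-u}} \Pr(x_{T-v}^{T-1}, s_{T-v-1}^{T-u-1} \mid y^{T-u-1}) \prod_{\tau=T}^{N} \Bigl[ \Pr(x_\tau \mid x_{\tau-v}^{\tau-1}, s_{\tau-v-1}, y^{T-u-1}, y_{T-u}^{\tau-u-1}) \\
& \qquad \times \Pr(y_{\tau-u}, s_{\tau-u} \mid x_{\tau-u}, s_{\tau-u-1}) \Bigr] \, \Pr(y_{N-u+1}^{N} \mid x_{N-u+1}^{N}, s_{N-u}) \\
& = \sum_{s_{N-v}^{N-u}} \Pr(x_{T-v}^{T-1}, s_{T-v-1}^{T-u-1} \mid \tilde{y}^{T-u-1}) \prod_{\tau=T}^{N} \Bigl[ \Pr(x_\tau \mid x_{\tau-v}^{\tau-1}, s_{\tau-v-1}, \tilde{y}^{T-u-1}, y_{T-u}^{\tau-u-1}) \\
& \qquad \times \Pr(y_{\tau-u}, s_{\tau-u} \mid x_{\tau-u}, s_{\tau-u-1}) \Bigr] \, \Pr(y_{N-u+1}^{N} \mid x_{N-u+1}^{N}, s_{N-u}) \\
& \overset{(b)}{=} \Pr(x_{T-v}^{N}, s_{T-v-1}^{N-v-1}, y_{T-u}^{N} \mid \tilde{y}^{T-u-1})
\end{aligned} \tag{47}$$

where equalities (a) and (b) result from Proposition 1 and the assumption $u \le v$. The equalities in (47) imply

$$\sum_{t=T}^{N} I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y_{T-u}^{t-1}, y^{T-u-1}) = \sum_{t=T}^{N} I(X_{t-v}^t, S_{t-v-1}; Y_t \mid Y_{T-u}^{t-1}, \tilde{y}^{T-u-1}).$$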
Therefore, the optimal policies after time $T$ for $y^{T-u-1}$ must also be optimal for $\tilde{y}^{T-u-1}$, and vice versa. Since
$y^{T-u-1}$ and $\tilde{y}^{T-u-1}$ induce the same vector $\underline{\alpha}_{T-1} = \tilde{\underline{\alpha}}_{T-1}$, $\underline{\alpha}_{T-1}$ can be used instead of $y^{T-u-1}$ and the optimal
policies after time $T$ can be replaced by

$$\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \underline{\alpha}_{T-1}, y_{T-u}^{t-u-1}) \mid T \le t \le N\}.$$

Since $T$ is chosen arbitrarily, the optimal source in the set $\{\Pr(x_t \mid x_{t-v}^{t-1}, s_{t-v-1}, \underline{\alpha}_{t-1})\}$ achieves the
same supremum $\mathcal{I}_{\mathrm{FB,SI}}(u,v)$ as the optimal source in the set $\mathcal{P}_v(u,v)$ does. ■